In [A.W. Harrow and R.A. Low, Commun. Math. Phys. 291, 257-302 (2009)], it was
shown that a quantum circuit composed of random 2-qubit gates converges to an approximate
quantum 2-design in polynomial time. We point out and correct a flaw in one of the paper's
main arguments. Our alternative argument highlights the role played by transpositions
induced by the random gates in achieving convergence.



I. INTRODUCTION 

Quantum k-designs [1] are statistical ensembles over the sets of states or operators of a quantum
system that faithfully reproduce the $k$-th moments of the respective uniform distributions. These
pseudo-random ensembles are of interest since they can often be efficiently simulated in a physical
system. In other words, while physically generating random states or operators of an n-qubit
quantum system requires resources that grow exponentially in n, pseudorandom objects may require
only polynomial resources [2]. They are thus a practical tool for a wide variety of communication
and computation tasks that make use of random quantum objects (e.g., [3-5]).

In ref. [6], Harrow and Low (HL) have provided an example of an efficient construction of a
quantum 2-design for operators of an n-qubit system, i.e., one that can be physically implemented
using resources that scale polynomially with n. Unlike previous constructions with this property
[7-9], their scheme appears to be efficient also for higher values of k [10]. The construction is based
on a random quantum circuit model [4]: at each step of the circuit, a pair of qubits is chosen
at random, and a 2-qubit gate is applied to them, drawn from some ensemble $\mu$ over the set of
all such gates. The pseudorandom n-qubit operators that result from this procedure have second
moments whose evolution can be reduced to a classical Markov chain [12]. In particular, the
(approximate) convergence of this chain to its stationary state is sufficient to ensure the convergence
of the pseudorandom operator ensemble to an approximate quantum 2-design [6].

In this note we wish to point out and correct a flaw in a significant step of this analysis, on
which the main results of ref. [6] directly depend. Specifically, the proof of Corollary 5.1 (p. 284),
a statement concerning the number of steps required for the convergence of the Markov chain,
is incorrect. We give an alternative argument showing that the statement itself is indeed valid.
Our proof highlights the role played by transpositions induced by the random gates in achieving
convergence.

We assume that the reader is familiar with ref. [6]. In section II we summarize some of its
results, explaining where they are affected by the flawed step. In section III we explain the flaw
itself, giving an explicit counterexample. In section IV we give the general idea of our argument,
and develop some preliminary results using standard tools from Markov chain theory and group
representations. Section V contains our main result, with several details left to the Appendix.



II. SUMMARY OF RESULTS IN [6]

Following a strategy introduced in [11, 12], the first part of ref. [6] establishes a map from the
evolution of second moments of a random quantum circuit to a classical Markov chain $P$ with state
space $\Omega_P = \{0,1,2,3\}^n$. When the ensemble $\mu$ is chosen to be the uniform (Haar) distribution
over $U(4)$, $P$ turns out to have a particularly simple form, described by the following algorithm:
given a position $p = (p_1, \ldots, p_n) \in \Omega_P$, choose a new position as follows:

- choose randomly and uniformly a pair of indices $1 \le i \neq j \le n$.

- if $p_i = p_j = 0$, do nothing.

- if $(p_i, p_j) \neq (0,0)$, replace the pair with any element of $\{0,1,2,3\}^2 \setminus (0,0)$,
choosing uniformly from the 15 possibilities.

(1)
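The update rule (1) can be sketched in a few lines of Python. This is only an illustrative simulation, not code from ref. [6]; the function name `p_chain_step` and the constant `NONZERO_PAIRS` are hypothetical, and sites are indexed from 0 rather than from 1.

```python
import random

# The 15 allowed outcomes for a hit pair: every element of {0,1,2,3}^2 except (0,0).
NONZERO_PAIRS = [(a, b) for a in range(4) for b in range(4) if (a, b) != (0, 0)]

def p_chain_step(p, rng=random):
    """Apply one step of rule (1) to the tuple p and return the new tuple."""
    n = len(p)
    i, j = rng.sample(range(n), 2)       # two distinct sites, chosen uniformly
    if p[i] == 0 and p[j] == 0:
        return tuple(p)                  # both coordinates are zero: do nothing
    new = list(p)
    new[i], new[j] = rng.choice(NONZERO_PAIRS)   # uniform over the 15 possibilities
    return tuple(new)

if __name__ == "__main__":
    state = (0, 0, 1, 0)
    for _ in range(10):
        state = p_chain_step(state)
    print(state)
```

Starting from the all-zero string the state never changes, and starting from any other string the all-zero string is never reached, which is the isolation of the state $(0 \ldots 0)$ used in the text.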



The corresponding Markov matrix $P(p,p')$ has the form $P = \frac{2}{n(n-1)} \sum_{i<j} P^{(ij)}$, where $P^{(ij)}$ affects
only the $i,j$ coordinates of $p$. Apart from an isolated stationary state $\mathbf{0} = (0 \ldots 0)$, this Markov
chain is ergodic, with stationary state given by the uniform distribution

$$\pi(p) = (4^n - 1)^{-1}, \quad \forall p \in \Omega_P \setminus \{\mathbf{0}\}. \qquad (2)$$

The key technical problem is then to analyze the convergence time of $P$, measured for example by
its mixing time in the trace norm:

$$t_{\mathrm{mix}P}(\epsilon) := \max_{p \in \Omega_P \setminus \mathbf{0}} \left[ \min\{ t \mid \|P^t(p,\cdot) - \pi\| \le \epsilon \} \right]. \qquad (3)$$

HL's approach is to concentrate first on the much smaller Markov chain $Z$ which tracks the
number of nonzero coordinates (i.e., the Hamming weight $H(p) = |\{i \mid p_i \neq 0\}|$) of states evolving
under the $P$ chain. This 'zero chain' is ergodic on the state space $\Omega_Z = \{1, \ldots, n\}$, with stationary
state

$$\zeta_\pi(H) = \binom{n}{H} \frac{3^H}{4^n - 1}, \quad H \in \Omega_Z. \qquad (4)$$

Its only nonvanishing transition probabilities are (eq. 5.2 in [6]):

$$Z(H, H+1) = \frac{3}{5} H(n-H) \binom{n}{2}^{-1},$$
$$Z(H, H-1) = \frac{1}{5} H(H-1) \binom{n}{2}^{-1}, \qquad (5)$$
$$Z(H, H) = 1 - Z(H, H-1) - Z(H, H+1) = 1 - \frac{2H(3n - 2H - 1)}{5n(n-1)}.$$

Determining a tight upper bound on the mixing time $t_{\mathrm{mix}Z}(\epsilon)$ of this chain turns out to be quite
tricky. The main difficulty is dealing with states with small values of $H$, which by eq. (5) only have
probability $O(1/n)$ of evolving. Nevertheless, after a laborious calculation, HL are able to show in
Theorem 5.1 that $t_{\mathrm{mix}Z}(\epsilon) = \Theta(n \log(n/\epsilon))$.
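As a sanity check on eqs. (4) and (5), the zero chain can be built with exact rational arithmetic and tested for stationarity. The sketch below is illustrative only; the helper names `zero_chain` and `stationary` are hypothetical, and the matrix row for weight $H$ is stored at index $H-1$.

```python
from fractions import Fraction
from math import comb

def zero_chain(n):
    """Transition matrix of eq. (5); entry [H-1][H'-1] is Z(H, H')."""
    pairs = comb(n, 2)
    Z = [[Fraction(0)] * n for _ in range(n)]
    for H in range(1, n + 1):
        up = Fraction(3 * H * (n - H), 5 * pairs)    # Z(H, H+1), zero when H = n
        down = Fraction(H * (H - 1), 5 * pairs)      # Z(H, H-1), zero when H = 1
        if H < n:
            Z[H - 1][H] = up
        if H > 1:
            Z[H - 1][H - 2] = down
        Z[H - 1][H - 1] = 1 - up - down
    return Z

def stationary(n):
    """Stationary distribution of eq. (4), indexed by H - 1."""
    return [Fraction(comb(n, H) * 3 ** H, 4 ** n - 1) for H in range(1, n + 1)]

if __name__ == "__main__":
    n = 6
    Z, zeta = zero_chain(n), stationary(n)
    image = [sum(zeta[a] * Z[a][b] for a in range(n)) for b in range(n)]
    print(image == zeta)          # stationarity of eq. (4) under eq. (5)
    print(1 - Z[0][0])            # probability of leaving H = 1, equal to 6/(5n)
```

The second printed quantity makes the $O(1/n)$ bottleneck at small Hamming weight explicit: a state with $H = 1$ moves with probability $6/(5n)$ per step.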

The next step in the analysis is the one that concerns us in this Comment. In Corollary 5.1,
Harrow and Low state that, once the $Z$ chain has approximately mixed, then $O(n \ln(n/\epsilon))$ further
steps suffice to ensure the convergence of the $P$ chain as a whole, so that

Corollary 5.1 [6]: The full ($P$) chain mixes in time $t_{\mathrm{mix}P}(\epsilon) = \Theta(n \log(n/\epsilon))$.

It is important to emphasize that, despite its moniker, this result is in fact an independent
theorem that does not follow automatically from other results in [6]. It is also a vital step in the
main argument of the paper, as it implies immediately (see eq. 5.7 and Theorem 4.1) that the
spectral gap $\Delta$ of the $P$ chain is of order $\Omega(1/n)$. This fact is, in turn, necessary for the main
conclusions of the paper, viz. Theorem 2.2 giving the polynomial bound for the convergence of a
random quantum circuit to a 2-design.



Unfortunately, as we now show, the demonstration of Corollary 5.1 given in [6] is flawed.

III. FLAW IN THE PROOF OF COROLLARY 5.1 

The argument given in [6] is based on the well-known 'coupon collector' scenario, where
one must complete a collection of $n$ different coupons by acquiring them at random. In the present
context, each 'coupon' corresponds to a coordinate $i$ of $p$, which is 'collected' when it is first chosen
in eq. (1) together with another $j$ such that $(p_i, p_j) \neq (0,0)$. HL carefully show that, if the $Z$ chain
has already converged, then after $O(n \ln(n/\epsilon))$ circuit steps, the probability that all coordinates
have been 'hit' in this sense is greater than $1 - \epsilon$. The crux of their argument is however the
following statement (p. 284):

Once each site of the full chain has been hit, (...) the chain has mixed. This is because, 
after each site has been hit, the probability distribution over the states is uniform. 

Indeed, if this were true, then standard results, based on the concept of a 'strong stationary time'
(SST)^1, would allow the bound on the 'album completion' time $\tau$ to be converted into one on the
$P$ chain's mixing time. Unfortunately, however, the quoted statement is incorrect: the probability
distribution conditioned on all sites being hit is in fact not uniform, and an SST-type argument
cannot be used.

This is already apparent in eq. (1): note that, conditioned on a site $i$ having just been hit, its
value $p_i$ has probability 1/5 of becoming 0 and 4/15 of becoming each of 1, 2 or 3. In particular, since this is
true of the last site to be hit, the overall distribution for $p$ conditioned on all sites being hit cannot
be uniform.

One can also construct an explicit counterexample. Choose for example $n = 3$ qubits (the
simplest nontrivial case) and initial state $y = (0\,0\,1)$. Starting from $y$, consider those evolutions
such that all three sites are 'collected' after two circuit steps. By exhausting all such cases, it is
straightforward to check that the conditional probability of reaching each final state is not uniform.
For example: the probabilities of obtaining $(1\,0\,0)$ or $(0\,0\,1)$ have a ratio 3:2.
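The exhaustive check described above is small enough to carry out by machine. The following sketch enumerates every two-step history of rule (1) for $n = 3$ with exact fractions, keeping only histories in which all sites have been hit. The function name `hit_distribution` is hypothetical, sites are indexed from 0, and the weights returned are joint probabilities (not yet normalised by the probability that all sites are hit), which is sufficient for comparing ratios.

```python
from fractions import Fraction
from itertools import combinations

NONZERO_PAIRS = [(a, b) for a in range(4) for b in range(4) if (a, b) != (0, 0)]

def hit_distribution(start, steps):
    """Joint probability of each final state and the event 'every site was hit'."""
    n = len(start)
    pairs = list(combinations(range(n), 2))
    # frontier maps (state, set of sites hit so far) -> probability
    frontier = {(tuple(start), frozenset()): Fraction(1)}
    for _ in range(steps):
        nxt = {}
        for (p, hit), w in frontier.items():
            for (i, j) in pairs:
                w_pair = w / len(pairs)
                if p[i] == 0 and p[j] == 0:          # nothing happens, nobody is hit
                    nxt[(p, hit)] = nxt.get((p, hit), 0) + w_pair
                    continue
                for (a, b) in NONZERO_PAIRS:
                    q = list(p)
                    q[i], q[j] = a, b
                    key = (tuple(q), hit | {i, j})
                    nxt[key] = nxt.get(key, 0) + w_pair / 15
        frontier = nxt
    everyone = frozenset(range(n))
    out = {}
    for (p, hit), w in frontier.items():
        if hit == everyone:
            out[p] = out.get(p, 0) + w
    return out

if __name__ == "__main__":
    weights = hit_distribution((0, 0, 1), steps=2)
    print(weights[(1, 0, 0)] / weights[(0, 0, 1)])
```

For the initial state $(0\,0\,1)$ the weights of $(1\,0\,0)$ and $(0\,0\,1)$ come out in the ratio 3:2, so the conditional distribution is visibly non-uniform.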



IV. ALTERNATIVE STRATEGY AND SYMMETRY ANALYSIS 



While it is conceivable that, with appropriate tweaking, an SST-based argument might still be 
found for Corollary 5.1, we have been unable to do so. We propose instead a different strategy, 
based on reducing the analysis of the P chain to that of another well-known problem in Markov 
chain theory: the repeated random transposition of n objects. Note that other kinds of argument 
may also be possible, for instance via coupling (A. Harrow, private communication).



Much is known about the random transposition chain; in particular, P. Diaconis and
collaborators have shown that it converges to within $\epsilon$ of a random permutation after $O(n \ln(n/\epsilon))$
steps.^2 In order to see how this result applies to the problem at hand, let us define the set of states
sharing the same Hamming weight $H$:

$$G_H := \{p \mid H(p) = H\}. \qquad (6)$$

^1 An SST [13, 15] is an instant $\tau$ when the distribution $X_\tau$ of the chain conditional on a certain event occurring
matches the stationary one $\pi$. More precisely, $X_\tau$ must be obtained independently of $\tau$, and of the initial state $y$
of the chain, i.e.: $P_y\{X_\tau = x, \tau = t\} = \pi(x) P_y\{\tau = t\}$. Under these circumstances a bound on the mixing time can
be established (see, e.g., Proposition 6.10 in [13]).

Since the $Z$ chain mixes after $\Theta(n \ln(n/\epsilon))$ circuit steps, then at that point the total probability
for each $G_H$ is approximately correct. However, the probability distributions within each set may
still be uneven, and so it is not yet possible to ensure that the full $P$ chain has mixed to its uniform
stationary state.

Note now that all elements of $G_H$ are equivalent up to permutations of their indexes and/or
of the values 1, 2 or 3 of their nonzero coordinates. One can thus expect that applying a random
permutation of these variables will result in the mixing of $P$. Lemmas 1 and 2 below show that
this is indeed true.

The remaining question is then: how do we ensure that such a permutation is applied? A simple
way is to do it 'by hand'. For example, once the $Z$ chain has mixed, we can apply an efficient
permutation-generating algorithm such as the Durstenfeld-Knuth shuffle, which requires $O(n)$
transpositions to generate an exactly randomly distributed permutation of the indexes of $p$. In
physical terms, each transposition can be implemented by a SWAP gate on the corresponding
qubit pair. Subsequently, all we need is to apply independent permutations of the values 1, 2, 3 on
each site. These can all be done in parallel, by applying a random choice from the set of Pauli
rotations $\{\sigma_i\}_{i=0}^{3}$ on each qubit (compare, e.g., the $C_1/P_1$-twirl). The overall number of circuit
steps for the entire algorithm is therefore still $O(n \ln(n/\epsilon))$. Once this is done, the remainder of the
argument in [6] implies that an approximate quantum 2-design will indeed have been generated.
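For concreteness, the Durstenfeld (Fisher-Yates) shuffle can be written so that it returns the list of transpositions to apply, each of which would correspond to one SWAP gate in the 'by hand' strategy. This is a minimal sketch with hypothetical helper names (`shuffle_transpositions`, `apply_swaps`), using 0-based site indices.

```python
import random

def shuffle_transpositions(n, rng=random):
    """Durstenfeld shuffle expressed as at most n - 1 transpositions (i, j).

    Applying the returned swaps in order to n positions produces a uniformly
    distributed random permutation of those positions.
    """
    swaps = []
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)        # uniform on {0, ..., i}, endpoints included
        if i != j:                   # i == j would be the identity: no gate needed
            swaps.append((i, j))
    return swaps

def apply_swaps(p, swaps):
    """Apply a list of transpositions to the string p."""
    p = list(p)
    for i, j in swaps:
        p[i], p[j] = p[j], p[i]
    return tuple(p)

if __name__ == "__main__":
    p = (0, 0, 1, 3, 2, 0)
    swaps = shuffle_transpositions(len(p))
    print(swaps, apply_swaps(p, swaps))
```

The number of swaps never exceeds $n - 1$, in line with the $O(n)$ count quoted above, and the Hamming weight of the string is unchanged by construction.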

Of course, following this strategy requires switching mid-way from the 'pure' random quantum
circuit model described by Harrow and Low to a different algorithm. This is irrelevant if all that is
required is an efficient means of generating a 2-design. Our interest here, however, is to show that
the same result is also achieved within the original random circuit model. Specifically, in section V
we will show that, once the $Z$ chain has mixed, the $P$ chain itself performs the role of a random
transposition chain. Diaconis et al.'s results then ensure that $P$ mixes in $O(n \ln(n/\epsilon))$ additional
steps, and so the overall number of steps will also be of order $\Theta(n \ln(n/\epsilon))$.

Before we formalize these ideas, it is useful to exploit the symmetries of $P$ in order to reduce its
analysis to that of a simpler chain, which we call $Q$. This requires some elementary results from
the application of group representation theory to Markov chains [17].

^2 In fact, much sharper statements can be made [16, 17], but these are not necessary here.

Markov Chain Projections: Suppose a Markov chain $M$, with state space $\Omega_M$, is invariant
under an action of some group $G$, i.e.: $M(g(x), g(y)) = M(x,y)$, $\forall g \in G$, $\forall x,y \in \Omega_M$. If $G_a, G_b \subseteq \Omega_M$
are orbits induced by the group action, then the rule

$$N(G_a, G_b) := \sum_{y \in G_b} M(x,y), \quad x \in G_a \qquad (7)$$

defines^3 a new Markov chain $N$ over the set of all orbits, $\{G_j\} = \Omega_N$. This 'projected' chain can
be seen as a coarse-graining of the original one. Every probability distribution $\mu(x)$ over $\Omega_M$ has
a natural projection $\bar{\mu}(a) = \sum_{x \in G_a} \mu(x)$. In particular, if $\pi$ is a stationary distribution for
$M$, then $\bar{\pi}$ is a stationary distribution for $N$. Also, every eigenfunction $h$ of $N$ can be lifted onto a
corresponding eigenfunction $f$ of $M$, with the same eigenvalue, defined by $f(x) := h(a)$, $\forall x \in G_a$
(see e.g. Lemma 12.8 in [13]). The converse is, in general, not true, since eigenfunctions of $M$ can
project to zero. Thus the projected chain can have fewer eigenvalues than the original [13], and
simpler dynamics. In particular, if both chains are ergodic, $N$ mixes at least as fast as $M$.

The $Z$ chain is an example of a projection of $P$. By eq. (1), the transition probabilities $P(p,p')$
of the $P$ chain are insensitive to whether the nonzero coordinates of $p$ and $p'$ are equal to 1, 2 or
3 (they only distinguish these values from 0). They are also invariant under permutations of the
indexes of $p, p'$. The group subsuming both these symmetries is isomorphic to the wreath product^4
$S_3 \wr S_n$. The corresponding orbits in $\Omega_P$ are precisely the sets $G_H$, and the projected chain resulting
from eq. (7) is the $Z$ chain.
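The projection rule (7) can be exercised directly on a small instance: build the full $P$ chain for $n = 3$ from rule (1), group states by Hamming weight, and read off the projected transition probabilities. The sketch below is illustrative; `p_matrix` and `project` are hypothetical helper names, and the in-line assertion only checks that the nonzero orbit sums agree across representatives of the same orbit.

```python
from fractions import Fraction
from itertools import combinations, product

def p_matrix(n):
    """Transition probabilities of rule (1) as a dict {(p, p'): probability}."""
    states = list(product(range(4), repeat=n))
    pairs = list(combinations(range(n), 2))
    nonzero = [(a, b) for a in range(4) for b in range(4) if (a, b) != (0, 0)]
    P = {}
    for p in states:
        for (i, j) in pairs:
            w = Fraction(1, len(pairs))
            if p[i] == 0 and p[j] == 0:
                P[p, p] = P.get((p, p), 0) + w
                continue
            for (a, b) in nonzero:
                q = list(p)
                q[i], q[j] = a, b
                q = tuple(q)
                P[p, q] = P.get((p, q), 0) + w / 15
    return states, P

def project(states, P, label):
    """Eq. (7): N(a, b) = sum over y with label b of P(x, y), for x with label a."""
    N = {}
    for x in states:
        row = {}
        for y in states:
            if (x, y) in P:
                row[label(y)] = row.get(label(y), 0) + P[x, y]
        for b, val in row.items():
            # every representative x of the orbit must give the same sum
            assert N.setdefault((label(x), b), val) == val
    return N

def hamming(p):
    return sum(1 for v in p if v != 0)

if __name__ == "__main__":
    states, P = p_matrix(3)
    Z = project(states, P, hamming)
    print(Z[1, 2], Z[2, 1])    # compare with eq. (5) at n = 3
```

For $n = 3$ this yields $Z(1,2) = 2/5$ and $Z(2,1) = 2/15$, in agreement with eq. (5).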

Q chain: It is useful to define a less coarsely-grained projection of $P$, which we call $Q$, with state
space $\Omega_Q = \{0,1\}^n$ (the vertices of a unit hypercube). Consider the action on $\Omega_P$ by the subgroup
$S_3^n \subset S_3 \wr S_n$ formed by independent permutations of the values 1, 2 and 3 of each coordinate of $p$.
The resulting set of orbits is isomorphic to $\Omega_Q$, under the bijection $q \leftrightarrow G_q = \{p \mid p_i = 0 \Leftrightarrow q_i = 0\}$.
By eq. (7), the corresponding projected chain is

$$Q(q,q') = \sum_{p' \in G_{q'}} P(p,p'), \quad p \in G_q,$$

with stationary state on $\Omega_Q \setminus \{\mathbf{0}\}$ given by the projection of $\pi$ in eq. (2):

$$\nu_\pi(q) = \frac{3^{H(q)}}{4^n - 1}. \qquad (8)$$

Like $P$, $Q$ may be written as a convex sum $Q = \frac{2}{n(n-1)} \sum_{i<j} Q^{(i,j)}$, where each $Q^{(i,j)}(q,q')$ vanishes
except for pairs $q, q'$ that differ only at coordinates $i$ and $j$. When restricted to these coordinates,
the matrix $Q^{(i,j)}$ always has the same form, given in table I.

^3 Note that this sum is independent of the choice of $x$.

^4 This is the semidirect product $S_3^n \rtimes_\phi S_n$, where $\phi$ is the natural homomorphism of $S_3^n$ induced by elements of $S_n$.

The reason for defining $Q$ is that, despite being a projection of $P$, the two chains have completely
equivalent dynamics - we can therefore restrict ourselves to studying the simpler chain^5. As we now
show, this happens because the $P$ chain does not distinguish between the elements within each
orbit $G_q$.

Lemma 1 The mixing times $t_{\mathrm{mix}Q}(\epsilon)$ and $t_{\mathrm{mix}P}(\epsilon)$ are equal for all $\epsilon > 0$.

Proof: Since $P$ is a reversible Markov chain, the $t$-th power of its matrix can be expanded as

$$\frac{P^t(p,p')}{\pi(p')} = \sum_{j=1}^{|\Omega_P|} f_j(p) f_j(p') \lambda_j^t,$$

where $f_j : \Omega_P \to \mathbb{R}$, $f_j \in \ell^2(\pi)$ are the eigenfunctions of $P$, with corresponding eigenvalues $\lambda_j$,
and which are orthonormal with respect to the stationary measure $\pi$ (see Lemma 12.2 in [13]).
Similarly,

$$\frac{Q^t(q,q')}{\nu_\pi(q')} = \sum_{j=1}^{|\Omega_Q|} h_j(q) h_j(q') \alpha_j^t,$$

where $h_j \in \ell^2(\nu_\pi)$, $\alpha_j$ are the eigenfunctions of $Q$ and corresponding eigenvalues. As previously
noted, each $h_j$ can be lifted to a corresponding $f_j$ with the same eigenvalue, given by $f_j(p) = h_j(q)$, $\forall p \in G_q$.

Note now that, by eq. (1),

$$P(p_1, p_1') = P(p_2, p_2'), \quad \forall p_1, p_2 \in G_q, \; p_1', p_2' \in G_{q'}. \qquad (9)$$

In other words, $P$ can be written as a block-constant matrix, with rank equal to that of $Q$. This
implies that the eigenfunctions 'lifted' from $h_j$ are the only eigenfunctions of $P$ with non-zero
eigenvalues. For each $p \in G_q$, we have then

$$\sum_{p' \in G_{q'}} |P^t(p,p') - \pi(p')| = \sum_{p' \in G_{q'}} \pi(p') \left| \sum_{j=1}^{|\Omega_P|} f_j(p) f_j(p') \lambda_j^t - 1 \right|
= \nu_\pi(q') \left| \sum_{j=1}^{|\Omega_Q|} h_j(q) h_j(q') \alpha_j^t - 1 \right| = |Q^t(q,q') - \nu_\pi(q')|, \qquad (10)$$

since there are $3^{H(q')}$ elements in $G_{q'}$. Finally, since the orbits $G_q$ for $q \neq \mathbf{0}$ partition $\Omega_P \setminus \{\mathbf{0}\}$, we
obtain the desired result by substituting in eq. (3). □

^5 A related strategy is used in [11].

Note that the $Z$ chain is also a projection of the $Q$ chain under the natural action of $S_n$ on $\Omega_Q$,
with orbits $G_H$. Thus every probability distribution $\nu(q)$ over $\Omega_Q \setminus \{\mathbf{0}\}$ has a projection
$\zeta_\nu(H) = \sum_{q \in G_H} \nu(q)$.

We are now ready to formalize the intuitive argument given at the beginning of this section.
Given a permutation $\sigma \in S_n$, let $A_\sigma$ be its natural representation as a Markov matrix acting on the
space of probability distributions over $\Omega_Q$:^6

$$[\nu A_\sigma](q) := \nu(\sigma(q)) \qquad (11)$$

where $\sigma(q)_i = q_{\sigma^{-1}(i)}$ is the natural action of $\sigma$ on $\Omega_Q$. We can extend this representation to
any probability distribution over $S_n$ by taking convex combinations of the $A_\sigma$. In particular, the
uniform distribution is represented by the Markov matrix $S = \frac{1}{n!} \sum_{\sigma \in S_n} A_\sigma$.

The following lemma shows that applying this random permutation to any distribution $\nu$ over
$\Omega_Q$ brings it as close to the stationary state of $Q$ as its projection $\zeta_\nu$ is to the stationary state of $Z$.

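Lemma 2 $\|\nu S - \nu_\pi\|_{TV} = \|\zeta_\nu - \zeta_\pi\|_{TV}$

Proof: By definition the orbits $G_H$ are invariant under permutations, so

$$\sum_{q \in G_H} \nu S(q) = \frac{1}{n!} \sum_{q \in G_H} \sum_{\sigma \in S_n} \nu(\sigma(q)) = \sum_{q \in G_H} \nu(q) = \zeta_\nu(H).$$

Also, since $S A_\sigma = S$, $\forall \sigma$, then $\nu S$ is a constant function on $G_H$:

$$\nu S(q) = \frac{\zeta_\nu(H)}{|G_H|}, \quad \forall q \in G_H. \qquad (12)$$

By eq. (8), the same is also true for $\nu_\pi$. Thus, using also eq. (7):

$$\|\nu S - \nu_\pi\|_{TV} = \frac{1}{2} \sum_{q \in \Omega_Q} |\nu S(q) - \nu_\pi(q)|
= \frac{1}{2} \sum_{H \in \Omega_Z} \left| \sum_{q \in G_H} \nu S(q) - \sum_{q \in G_H} \nu_\pi(q) \right|
= \|\zeta_\nu - \zeta_\pi\|_{TV}. \; \square \qquad (13)$$

^6 Here, as is usual in the Markov chain literature [13], $\nu$ is a row vector and the Markov matrix $A_\sigma$ acts on the left.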
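Lemma 2 is easy to test numerically on a small system. The sketch below draws an arbitrary distribution $\nu$ on $\Omega_Q \setminus \{\mathbf{0}\}$, symmetrizes it over all site permutations to obtain $\nu S$, and compares the two total variation distances with exact fractions. The function name `lemma2_distances` and the way the test distribution is generated are assumptions made for illustration only.

```python
from fractions import Fraction as F
from itertools import permutations, product
from math import factorial
import random

def lemma2_distances(n, seed=0):
    """Return (||nu S - nu_pi||_TV, ||zeta_nu - zeta_pi||_TV) for a random nu."""
    rng = random.Random(seed)
    states = [q for q in product((0, 1), repeat=n) if any(q)]   # Omega_Q minus 0
    raw = [rng.randint(1, 100) for _ in states]
    nu = {q: F(w, sum(raw)) for q, w in zip(states, raw)}       # arbitrary distribution
    nu_pi = {q: F(3 ** sum(q), 4 ** n - 1) for q in states}     # eq. (8)
    # nu S: average of nu over all permutations of the site indices
    nuS = {q: sum(nu[tuple(q[s[i]] for i in range(n))]
                  for s in permutations(range(n))) / factorial(n)
           for q in states}
    tv_Q = sum(abs(nuS[q] - nu_pi[q]) for q in states) / 2

    def zeta(dist, H):            # projection onto Hamming weight H
        return sum(v for q, v in dist.items() if sum(q) == H)

    tv_Z = sum(abs(zeta(nu, H) - zeta(nu_pi, H)) for H in range(1, n + 1)) / 2
    return tv_Q, tv_Z

if __name__ == "__main__":
    print(lemma2_distances(4))
```

Because the arithmetic is exact, the two returned values should coincide identically, not just approximately.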
            Q^{(i,j)}                            M^{(i,j)}
        00    01    10    11                00    01    10    11
  00    1     0     0     0           00    1     0     0     0
  01    0     1/5   1/5   3/5         01    0     1/4   0     3/4
  10    0     1/5   1/5   3/5         10    0     0     1/4   3/4
  11    0     1/5   1/5   3/5         11    0     1/4   1/4   1/2

TABLE I: Transition probabilities $Q^{(i,j)}(q_i q_j, q'_i q'_j)$ (left) and $M^{(i,j)}(q_i q_j, q'_i q'_j)$ (right). On the left
column we have the initial values $q_i q_j$ and on the top line the final values $q'_i q'_j$.



V. PROOF OF COROLLARY 5.1 IN [6]



In this section we show how the $Q$ chain itself induces a random permutation of the indexes of
$q$, and how this leads to our desired result, Corollary 5.1 of [6]. Let us begin by introducing the
random transposition chain $T$ studied by Diaconis et al. Consider a set of $n$ different
objects occupying $n$ positions, and subject to the following evolution rule: at each step, two values
$1 \le i, j \le n$ are selected independently at random, and the objects at these positions are swapped.
If $i = j$, nothing happens. Formally, this can be seen as a random walk on $S_n$, with transition
probabilities between permutations $\sigma$ and $\rho$ given by

$$T(\sigma, \rho) = \tau(\rho \sigma^{-1}) \qquad (14)$$

where $\tau$ is the probability distribution over $S_n$ defined by

$$\tau(\sigma) = \begin{cases} 1/n, & \sigma = I \\ 2/n^2, & \sigma \text{ is a transposition} \\ 0, & \text{otherwise.} \end{cases} \qquad (15)$$

This chain is ergodic and converges to the uniform distribution. As we have already mentioned,
Diaconis et al. showed that this occurs with mixing time

$$t_{\mathrm{mix}T}(\epsilon) = O(n \ln(n/\epsilon)). \qquad (16)$$



Returning now to the components $Q^{(i,j)}$ of the $Q$ chain (see table I), notice that each can be
rewritten as the convex sum

$$Q^{(i,j)} = \frac{1}{5} T^{(i,j)} + \frac{4}{5} M^{(i,j)} \qquad (17)$$

where $T^{(i,j)}$ represents the transposition of coordinates $i$ and $j$ and $M^{(i,j)}$ is still a Markov matrix.
Thus, $Q$ can be seen as the combination of two Markov chains

$$Q = \frac{1}{5} T_p + \frac{4}{5} M \qquad (18)$$

where $T_p = \frac{2}{n(n-1)} \sum_{i<j} T^{(i,j)}$ and $M = \frac{2}{n(n-1)} \sum_{i<j} M^{(i,j)}$ respectively.
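The pair-level decomposition (17) can be verified entry by entry from Table I. The following sketch writes the three 4x4 matrices with rows and columns ordered 00, 01, 10, 11 and checks the identity with exact fractions; the variable names `Q_ij`, `M_ij`, `T_ij` and `combo` are chosen here for illustration.

```python
from fractions import Fraction as F

# Rows: initial values q_i q_j; columns: final values q'_i q'_j (order 00, 01, 10, 11).
Q_ij = [[1, 0, 0, 0],
        [0, F(1, 5), F(1, 5), F(3, 5)],
        [0, F(1, 5), F(1, 5), F(3, 5)],
        [0, F(1, 5), F(1, 5), F(3, 5)]]
M_ij = [[1, 0, 0, 0],
        [0, F(1, 4), 0, F(3, 4)],
        [0, 0, F(1, 4), F(3, 4)],
        [0, F(1, 4), F(1, 4), F(1, 2)]]
# Transposition of the two coordinates: 01 <-> 10, with 00 and 11 left fixed.
T_ij = [[1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]

# Right-hand side of eq. (17).
combo = [[F(1, 5) * T_ij[r][c] + F(4, 5) * M_ij[r][c] for c in range(4)]
         for r in range(4)]

if __name__ == "__main__":
    print(combo == Q_ij)
    print(all(sum(row) == 1 for row in M_ij))   # M_ij is a stochastic matrix
```

Each row of `M_ij` sums to one with non-negative entries, which is the sense in which $M^{(i,j)}$ "is still a Markov matrix".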

The $T_p$ chain represents a random transposition of the components of $q$. Though similar to $T$,
it is based on a different representation of the permutation group: here the transpositions $T^{(i,j)}$ act
on the state space $\Omega_Q$, and not $S_n$ itself. As a result, $T_p$ is reducible to independent chains on each
of the orbits $G_H$. Furthermore, since $T_p$ lacks the identity component present in eq. (15), an even
(resp. odd) number of steps will always lead to an even (resp. odd) permutation of $q$. Thus $T_p$ is
a non-convergent, periodic chain.

The latter difficulty can be easily removed by rewriting eq. (18) as

$$Q = \frac{1}{5} T + \frac{4}{5} \tilde{M} \qquad (19)$$

where $T = \frac{1}{n} I + \frac{n-1}{n} T_p$ is now aperiodic, and $\tilde{M} = M + \frac{1}{4n} [T_p - I]$ is an ergodic Markov chain on
$\Omega_Q \setminus \{\mathbf{0}\}$ for $n \ge 3$.^7

Alternatively, $T$ may also appear in eq. (18) if we modify the definition of the two-qubit gate
ensemble $\mu$, allowing at each step an extra probability $1/n$ of applying the identity gate. In this
case, the $P$ chain in eq. (1) becomes $P' = \frac{1}{n} I + \frac{n-1}{n} P$. The corresponding modification of $Q$ leads
to

$$Q' = \frac{1}{5} T + \frac{4}{5} \left[ \frac{1}{n} I + \frac{n-1}{n} M \right] = \frac{1}{5} T + \frac{4}{5} M'. \qquad (20)$$



The $T$ chain is not ergodic, as it is still reducible into independent chains $T_H$ on each orbit
$G_H$. In particular, $T$ does not have a unique stationary state. Nevertheless, it does converge to the
random permutation $S$ over $\Omega_Q$, and the mixing time given in eq. (16) is still valid, in the following
generalized sense:

Lemma 3 Each initial distribution $\nu$ on $\Omega_Q$ converges under $T$ to its randomized version $\nu S$. In
addition, eq. (16) remains valid under the generalized notion

$$t_{\mathrm{mix}T}(\epsilon) = \max_{\nu} \left[ \min\{ t \mid \|\nu T^t - \nu S\|_{TV} \le \epsilon \} \right]. \qquad (21)$$
^7 This is true since it can be shown [20] that i) $\tilde{M}$ is ergodic on $\Omega_Q \setminus \{\mathbf{0}\}$ and ii) its eigenvalues are lower-bounded
by $-\frac{2}{3} - \frac{1}{3(n-1)}$. Thus the eigenvalues of $\tilde{M}$ are all $> -1$ for $n \ge 3$.
Proof: This follows from the fact that each $T_H$ is isomorphic to a projection of $T$ in the sense of
eq. (7). See the Appendix for details.

Turning now to the $M$ chain, note from its definition that it is symmetric under permutations of
the site indexes, and in particular under transpositions. Thus $M$ (or its variants $\tilde{M}$, $M'$) commutes
with $T$.
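Both the decomposition (19) and the commutation property can be checked by brute force for small $n$. The sketch below averages the pair-local rules of Table I over all site pairs to build $Q$, $T_p$ and $M$ as $2^n \times 2^n$ matrices, then forms $T = \frac{1}{n} I + \frac{n-1}{n} T_p$ and $\tilde{M} = M + \frac{1}{4n}[T_p - I]$. All helper names (`lift`, `comb_lin`, `matmul`, `check`) are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations, product

# Pair-local rules on (q_i, q_j); rows and columns ordered 00, 01, 10, 11 (Table I).
M2 = [[1, 0, 0, 0], [0, F(1, 4), 0, F(3, 4)],
      [0, 0, F(1, 4), F(3, 4)], [0, F(1, 4), F(1, 4), F(1, 2)]]
T2 = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
Q2 = [[1, 0, 0, 0]] + [[0, F(1, 5), F(1, 5), F(3, 5)]] * 3

def lift(local, n):
    """Average a pair-local 4x4 rule over all pairs i < j on {0,1}^n."""
    states = list(product((0, 1), repeat=n))
    index = {s: k for k, s in enumerate(states)}
    pairs = list(combinations(range(n), 2))
    A = [[F(0)] * len(states) for _ in states]
    for s in states:
        for (i, j) in pairs:
            r = 2 * s[i] + s[j]
            for c in range(4):
                t = list(s)
                t[i], t[j] = divmod(c, 2)
                A[index[s]][index[tuple(t)]] += F(local[r][c]) / len(pairs)
    return A

def comb_lin(a, A, b, B):
    """Entrywise a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def check(n):
    Q, Tp, M = lift(Q2, n), lift(T2, n), lift(M2, n)
    I = [[F(int(r == c)) for c in range(2 ** n)] for r in range(2 ** n)]
    T = comb_lin(F(1, n), I, F(n - 1, n), Tp)                   # T = I/n + (n-1)/n T_p
    Mt = comb_lin(1, M, F(1, 4 * n), comb_lin(1, Tp, -1, I))    # M~ = M + (T_p - I)/(4n)
    return (Q == comb_lin(F(1, 5), T, F(4, 5), Mt),             # eq. (19)
            matmul(T, Mt) == matmul(Mt, T),                     # T and M~ commute
            all(x >= 0 for row in Mt for x in row))             # M~ has no negative entry

if __name__ == "__main__":
    print(check(3), check(4))
```

The third check confirms that subtracting the identity component from $M$ does not produce negative entries, so $\tilde{M}$ remains a stochastic matrix in these small cases.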

This property gives us an intuitive picture of how the $Q$ chain behaves. According to eq. (19),
each step of $Q$ can be seen as a random choice between moving according to $T$ or to $\tilde{M}$. A sequence
of $t$ steps will, for large enough $t$, contain roughly $t/5$ steps of $T$ and $4t/5$ steps of $\tilde{M}$. Note that
the latter are the only steps where the Hamming weight can change, and thus only they contribute
to the convergence of the zero chain $Z$. Moreover, since the $T$ and $\tilde{M}$ chains commute, we can
consider that all these $\tilde{M}$ steps happen first. Once $Z$ has converged, all we need is to wait for the
subsequent $T$ steps to build up to a random permutation of site indexes. Lemmas 1 and 2 then
ensure that the full $P$ chain will have converged.

Let us now formalize this argument.

Proof of Corollary 5.1 in [6]

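Let $\gamma > 0$, and let $t_0 = t_{\mathrm{mix}Z}(\gamma)$, so that the state $\zeta_\nu$ of the $Z$ chain after $t_0$ steps satisfies
$\|\zeta_\nu - \zeta_\pi\|_{TV} \le \gamma$. By Lemma 2 the corresponding state $\nu$ of the $Q$ chain at that moment lies within
the ball

$$B(\gamma) := \{\nu \mid \|\nu S - \nu_\pi\|_{TV} \le \gamma\}. \qquad (22)$$

Define now a mixing time for $Q$ for initial conditions restricted to this ball:

$$t_{\mathrm{mix}Q}(\epsilon, \gamma) := \max_{\nu \in B(\gamma)} \left[ \min\{ t \mid \|\nu Q^t - \nu_\pi\|_{TV} \le \epsilon \} \right]. \qquad (23)$$

In the Appendix, we show that this time is bounded by the mixing time of $T$:

Lemma 4

$$t_{\mathrm{mix}Q}(\epsilon, \gamma) \le \frac{25}{4} \left[ \delta + \sqrt{\delta^2 + \tfrac{4}{5}\, t_{\mathrm{mix}T}\!\left(\epsilon - \gamma - e^{-2\delta^2}\right)} \right]^2 \qquad (24)$$

for all $\epsilon > 0$ and all $\gamma > 0$, $\delta > 0$ satisfying $\epsilon > e^{-2\delta^2} + \gamma$.

Choosing $\gamma = \epsilon/2$, $\delta^2 = \frac{1}{2} \ln(4/\epsilon)$ gives

$$t_{\mathrm{mix}Q}(\epsilon, \epsilon/2) \le 25 \left[ \frac{1}{2} \ln(4/\epsilon) + \frac{4}{5}\, t_{\mathrm{mix}T}(\epsilon/4) \right].$$

Taking into account eq. (16) and the fact that $\ln(4/\epsilon) < n \ln(4n/\epsilon)$, $\forall n > 1$, it follows that there
exists an integer $K$ such that

$$t_{\mathrm{mix}Q}(\epsilon, \epsilon/2) \le K n \ln(4n/\epsilon).$$

Thus the mixing time for the entire $Q$ chain is

$$t_{\mathrm{mix}Q}(\epsilon) \le t_{\mathrm{mix}Z}(\epsilon/2) + t_{\mathrm{mix}Q}(\epsilon, \epsilon/2) = \Theta(n \ln(n/\epsilon)),$$

where we use the fact (Theorem 5.1 of [6]) that $t_{\mathrm{mix}Z}(\gamma) = \Theta(n \ln(n/\gamma))$. Finally, applying Lemma
1 proves our desired result. □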
Appendix: Proofs of Lemmas 3 and 4

1. Proof of Lemma 3

Since all elements of $G_H$ are equivalent under permutations, and transpositions generate all
permutations, $T_H$ is irreducible; it is also aperiodic due to the identity component in $T$. Thus, $T_H$
is ergodic, and it is easy to see that its stationary state is the uniform distribution over $G_H$. In other
words, any initial distribution $\nu_H$ over $G_H$ converges to $\nu_H S$ (see eq. (12)). Since $T = \bigoplus_H T_H$, the
same is true for any initial distribution $\nu$ over $\Omega_Q$.

Let us now link $T_H$ with Diaconis' $T$ chain. By the orbit-stabilizer theorem, $G_H$ is isomorphic
to the quotient $S_n/N_H$, where $N_H \subset S_n$ is the stabilizer of some element $x_0 \in G_H$. Explicitly,
we identify $x \in G_H \leftrightarrow g_x N_H$, where $g_x$ is any permutation such that $g_x(x_0) = x$. Since $T_H$ can
be described using the probability distribution $\tau$ in eq. (15), but with the transpositions acting on
$G_H$, it follows (see, e.g., Lemma 3 in section 3F of [17]) that its transition matrix is

$$T_H(x,y) = \tau(g_y N_H g_x^{-1}) = T(g_x, g_y N_H) \qquad (A.1)$$

where we have used eq. (14). Note now that $T$ is invariant under the action of $N_H$ on $S_n$ given by
$h(g) = gh$. The set of orbits of this action is precisely $S_n/N_H \cong G_H$. Comparing eq. (A.1) and
eq. (7), it is clear that $T_H$ is (isomorphic to) the projection of $T$ with respect to this action. Thus,
as discussed in section IV, the mixing time for $T_H$ is at most equal to that of $T$, in eq. (16).
Finally, the same is true for the chain $T$ on $\Omega_Q$, since $T = \bigoplus_H T_H$.



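2. Proof of Lemma 4

Let $p = 1/5$. After $t$ steps of $Q$, the TV distance to the stationary state $\nu_\pi$ is, from eq. (19)
(and likewise from eq. (20), with $M'$ in place of $\tilde{M}$):

$$d(t) := \|\nu Q^t - \nu_\pi\|_{TV}
= \left\| \sum_{i=0}^{t} \binom{t}{i} p^i (1-p)^{t-i} \left( \nu T^i - \nu_\pi \right) \tilde{M}^{t-i} \right\|_{TV}
\le \sum_{i=0}^{t} \binom{t}{i} p^i (1-p)^{t-i} \left\| \nu T^i - \nu_\pi \right\|_{TV}. \qquad (A.2)$$

In the first equation we have used the fact that $T$ and $\tilde{M}$ commute, and in the second the triangle
inequality, and also the facts that $\nu_\pi$ is the stationary state for $\tilde{M}$, and that applying an ergodic
Markov matrix can never increase the TV distance to its stationary state.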

Let us now split eq. (A.2) into two sums $d_1(t)$, $d_2(t)$, containing respectively terms with $i \le
pt - \delta\sqrt{t}$ and $i > pt - \delta\sqrt{t}$, where $\delta > 0$ is some constant such that $t > (\delta/p)^2$. In order to bound
$d_1(t)$ we can use the fact that TV distances between probability distributions are always $\le 1$, so

$$d_1(t) \le \sum_{i=0}^{\lfloor pt - \delta\sqrt{t} \rfloor} \binom{t}{i} p^i (1-p)^{t-i}.$$

This is a sum of terms in the tail of the binomial distribution, which can again be bound, for any
$t > (\delta/p)^2$, using e.g. the Hoeffding inequality [21]:

$$d_1(t) \le \exp(-2\delta^2).$$
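The Hoeffding step can be illustrated numerically by summing the lower binomial tail directly and comparing it with $\exp(-2\delta^2)$. This is a floating-point sketch with a hypothetical function name `lower_tail`; the particular values of $t$ and $\delta$ are arbitrary choices satisfying $t > (\delta/p)^2$.

```python
from math import comb, exp, floor, sqrt

def lower_tail(t, p, delta):
    """Sum of binomial(t, p) weights over i <= floor(p*t - delta*sqrt(t))."""
    cutoff = floor(p * t - delta * sqrt(t))
    return sum(comb(t, i) * p ** i * (1 - p) ** (t - i) for i in range(0, cutoff + 1))

if __name__ == "__main__":
    p, delta = 0.2, 1.0                      # p = 1/5 as in the proof
    for t in (30, 100, 400):                 # all satisfy t > (delta/p)**2 = 25
        print(t, lower_tail(t, p, delta), exp(-2 * delta ** 2))
```

In each case the tail sum stays below the Hoeffding bound, which depends only on $\delta$ and not on $t$.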

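We can also bound $d_2(t)$, as follows: since $\nu \in B(\gamma)$, then for each value of $i$:

$$\|\nu T^i - \nu_\pi\|_{TV} \le \|\nu T^i - \nu S\|_{TV} + \|\nu S - \nu_\pi\|_{TV} \le \|\nu T^i - \nu S\|_{TV} + \gamma.$$

Furthermore, since $T$ is ergodic on each orbit $G_H$, and the initial state $\nu$ converges to $\nu S$ by
Lemma 3, then the TV distance with respect to this state is non-increasing at each step of chain
$T$. Thus, all TV distances in $d_2$ are at most equal to that of the term with the smallest value
$i = \lfloor pt - \delta\sqrt{t} \rfloor + 1$:

$$d_2(t) \le \sum_{i=\lfloor pt - \delta\sqrt{t} \rfloor + 1}^{t} \binom{t}{i} p^i (1-p)^{t-i} \left[ \|\nu T^i - \nu S\|_{TV} + \gamma \right]
\le \left\| \nu T^{\lfloor pt - \delta\sqrt{t} \rfloor + 1} - \nu S \right\|_{TV} + \gamma,$$

since the sum is over part of the binomial distribution. Combining both bounds:

$$d(t) \le \exp(-2\delta^2) + \gamma + \left\| \nu T^{\lfloor pt - \delta\sqrt{t} \rfloor + 1} - \nu S \right\|_{TV}. \qquad (A.3)$$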



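Given now any $\epsilon > 0$, choose $\gamma, \delta > 0$ satisfying $e^{-2\delta^2} + \gamma < \epsilon$, and choose also $t$ to be the first
instant for which

$$\max_{\nu \in B(\gamma)} \left\| \nu T^{\lfloor pt - \delta\sqrt{t} \rfloor + 1} - \nu S \right\|_{TV} \le \epsilon - \gamma - e^{-2\delta^2}. \qquad (A.4)$$

(This instant exists, by Lemma 3.) Substituting in eq. (A.3) and using eq. (23):

$$t_{\mathrm{mix}Q}(\epsilon, \gamma) \le t. \qquad (A.5)$$

Define now, in analogy to eq. (23),

$$t_{\mathrm{mix}T}(\epsilon, \gamma) := \max_{\nu \in B(\gamma)} \left[ \min\{ t \mid \|\nu T^t - \nu S\|_{TV} \le \epsilon \} \right] \le t_{\mathrm{mix}T}(\epsilon), \qquad (A.6)$$

with the inequality resulting since $t_{\mathrm{mix}T}(\epsilon)$ maximizes over a larger set. Then we can restate
eq. (A.4) as

$$pt - \delta\sqrt{t} < \lfloor pt - \delta\sqrt{t} \rfloor + 1 = t_{\mathrm{mix}T}\!\left(\epsilon - \gamma - e^{-2\delta^2}, \gamma\right).$$

This inequality, which is quadratic in $\sqrt{t}$, can be inverted to give

$$\sqrt{t} < \frac{1}{2p} \left[ \delta + \sqrt{\delta^2 + 4p\, t_{\mathrm{mix}T}\!\left(\epsilon - \gamma - e^{-2\delta^2}, \gamma\right)} \right].$$

Using eqs. (A.5) and (A.6) we obtain the relation between the mixing times of $Q$ and $T$:

$$t_{\mathrm{mix}Q}(\epsilon, \gamma) \le \frac{1}{4p^2} \left[ \delta + \sqrt{\delta^2 + 4p\, t_{\mathrm{mix}T}\!\left(\epsilon - \gamma - e^{-2\delta^2}\right)} \right]^2. \; \square$$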